The clostridial neurotoxins (CNTs), comprising tetanus neurotoxin
(TeNT) and the seven serotypes of botulinum neurotoxin (BoNT A-G),
specifically bind to neuronal cells and disrupt neurotransmitter release by
cleaving proteins involved in synaptic vesicle membrane fusion. In this
study, multiple CNT sequences were analyzed within the context of the
1277 residue BoNT/A crystal structure to gain insight into the events of
binding, pore formation, translocation, and catalysis that are required for
toxicity. A comparison of the TeNT-binding domain structure to that of
BoNT/A reveals striking differences in their surface properties. Further,
the solvent accessibility of a key tryptophan in the C terminus of the
BoNT/A-binding domain refines the location of the ganglioside-binding
site. Data collected from a single frozen crystal of BoNT/A are included
in this study, revealing slight differences in the binding domain
orientation as well as density for a previously unobserved translocation
domain loop. This loop and the conservation of charged residues with
structural proximity to putative pore-forming sequences lend insight into
the CNT mechanism of pore formation and translocation. The sequence
analysis of the catalytic domain revealed an area near the active-site
likely to account for specificity differences between the CNTs. It also
revealed a tertiary structure, highly conserved in primary sequence, which
seems critical to catalysis but is 30 Å from the active-site zinc ion. This
observation, along with an analysis of the 54 residue "belt" from the
translocation domain, is discussed with respect to the mechanism of
catalysis.

Introduction

The Clostridium botulinum organism produces seven immunologically
distinct forms of botulinum neurotoxin (BoNT), designated A-G, from
different strains (Simpson, 1989). BoNT/E and BoNT/F have also been
observed from Clostridium butyricum and Clostridium baratii,
respectively (Aureli et al., 1986; Hall et al., 1985). Each toxin is
synthesized as an inactive ~150 kDa single-chain protein. The protein is
post-translationally proteolyzed to form the active dichain molecule in
which the two chains, ~50 and ~100 kDa, remain linked by a disulfide
bond (Figure 1(a)). The active dichain molecule is composed of three
~50 kDa functional domains: binding, translocation, and catalytic
(Montecucco & Schiavo, 1995). The binding domain comprises the
C-terminal half of the ~100 kDa chain, while the translocation domain is
located in its N-terminal half. The catalytic domain, a
zinc-endopeptidase, is confined to the N-terminal 50 kDa chain.

These domains correlate with a three-step model for toxicity (Simpson,
1980). In the first step, the binding domain mediates interaction between
the toxin and the presynaptic nerve terminal membrane (Dolly et al.,
1984). This interaction is thought to occur through both a ganglioside and
a protein receptor (Montecucco, 1986). Following binding, the protein is
internalized by receptor-mediated endocytosis (Black & Dolly, 1986a,b).
The second step is triggered by the lower pH of the endosome. Acidic pH is
thought to cause a structural change in the translocation domain, allowing
it to form a pore in the membrane. This pore allows the catalytic domain
to translocate across the membrane and gain access to its cytosolic
target. The third step involves the cleavage of one of three proteins
involved in synaptic vesicle membrane fusion. BoNT/B, BoNT/D, BoNT/F,
and BoNT/G cleave distinct sites within a vesicle-associated membrane
protein (VAMP, also referred to as synaptobrevin) (Schiavo et al., 1992,
1993a,c, 1994; Yamasaki et al., 1994a,b). BoNT/A and BoNT/E recognize
and cleave distinct sites near the C terminus of SNAP-25
(synaptosomal-associated protein of 25 kDa) (Binz et al., 1994; Blasi
et al., 1993a; Schiavo et al., 1993a,b), while BoNT/C1 cleaves syntaxin
and SNAP-25 (Blasi et al., 1993b; Foran et al., 1996). VAMP, SNAP-25, and
syntaxin are collectively termed the SNARE proteins and have been shown to
interact in a four-helix coiled-coil in a step thought to precede synaptic
vesicle membrane fusion (Sutton et al., 1998). Cleavage of any one of
these three proteins prior to coiled-coil formation disrupts
neurotransmitter release at the neuromuscular junction. The disruption
leads to the flaccid paralysis observed in the disease botulism.

Figure 1. (a) The 150 kDa toxin is post-translationally proteolyzed to
form the activated dichain molecule (the arrow in the top part of the
Figure illustrates where proteolysis occurs). Two disulfides exist, one
that links the two chains and the other in the C-terminal half of the
binding domain. The three functional domains are each ~50 kDa and
correspond to the catalytic domain (1-437), the translocation domain
(448-872), and the binding domain (873-1295), where the numbers refer to
the BoNT/A sequence. (b) A backbone trace of the BoNT/A structure with
the catalytic domain in purple, the translocation domain in green, the
N-terminal binding subdomain in pink, and the C-terminal binding subdomain
in blue. The catalytic zinc ion is represented as a sphere with the helix
containing the HEXXH zinc-binding motif in red. Helix α22 of the
C-terminal binding subdomain is colored orange to show the relative
orientation of the putative ganglioside-binding site to the rest of the
molecule. The structure represents residues 1-431 and 450-1295, with
residues 432-437 and residues 448-449 at the site of proteolytic cleavage
being disordered. This Figure was generated using MOLSCRIPT and Raster3D
(Kraulis, 1991; Merritt & Bacon, 1997).



TeNT is produced by Clostridium tetani but shares ~65% sequence homology
and ~35% identity with the BoNT serotypes. Like the BoNTs, tetanus toxin
is a 150 kDa dichain molecule that is thought to undergo a three-step
mechanism of binding, translocation, and catalysis. The key difference is
that while it binds at the presynaptic nerve terminal, it undergoes
retrograde transport up the axon to act in the central nervous system. The
molecular determinant of this alternate localization is unknown but is
thought to lie in differences in the protein receptor. TeNT cleaves a site
in VAMP identical with that cleaved by BoNT/B (Schiavo et al., 1992).
However, given the different localization, the effect of this cleavage is
to inhibit the release of inhibitory neurotransmitter, which causes the
spastic paralysis seen in tetanus poisoning.

The structure of BoNT/A was recently solved to 3.3 Å resolution using
data collected from multiple crystals at 4°C (Lacy et al., 1998). The
structure (Figure 1(b)) showed that the binding domain was structurally
similar to the TeNT binding domain (Umland et al., 1997), and could be
divided into two subdomains, an N-terminal β-barrel and a C-terminal
β-trefoil fold. The translocation domain fold is markedly different from
the folds observed in other toxins that undergo pore formation and
translocation (Lacy & Stevens, 1998). Most notable are a kinked pair of
α-helices, 105 Å in length, and a 54 residue "belt" that wraps around the
perimeter of the catalytic domain. The translocation domain occludes
access to a large, negatively charged cleft leading into the active-site
zinc ion of the catalytic domain. The zinc ion appears to be directly
coordinated by His222, His226, and Glu261, and by a water-mediated
coordination through Glu223.

The CNT sequence divergence is substantial when one considers that the
family of proteins has maintained striking similarities in a complex
series of functions. In spite of this divergence, it is reasonable to
presume that the BoNT/A structure will be representative for all of the
CNTs. Firstly, there is a greater conservation in predicted secondary
structure than in the primary structure (Lebeda & Olson, 1994). Secondly,
the structure of the binding domain overlays with the tetanus
neurotoxin-binding domain with a root-mean-square deviation (rmsd) of
1.5 Å for 363 Cα atoms. Lastly, the functional requirements, especially
those of acid-triggered pore formation and translocation, are rigorous and
are not likely to tolerate significant structural changes. It is tempting
to think that viewing the primary structure differences within the
tertiary structure will lend insight into the molecular determinants of
differences between the CNTs. The key differences seem to be an ability to
discriminate between protein receptor molecules and the enzyme specificity
for one of eight cleavage sites in the three SNARE proteins. An analysis
of the sequence conservation can also help identify regions required in
the common functions of ganglioside-binding, pore formation,
translocation, and catalysis.
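
For readers unfamiliar with the rmsd quoted above for the binding domain overlay, the following is a minimal sketch of how such a value is computed from paired Cα coordinates. It assumes the two structures have already been superimposed and that equivalent atoms are listed in the same order; the function name ca_rmsd and the plain-tuple coordinate format are illustrative assumptions and do not describe the software used in this study.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in Å) between paired atom positions.

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples.
    The structures are assumed to be superimposed already; this function
    does not perform any fitting.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

As a sanity check, 363 atom pairs that are each displaced by exactly 1.5 Å along one axis give an rmsd of 1.5 Å; in a real overlay the deviations are unevenly distributed, so a few divergent surface loops can dominate the value.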



Results and Discussion 

The binding domain 

The structure of the TeNT-binding domain has been solved by independent
investigators to 2.7 Å (Umland et al., 1997; PDB code 1af9) and 1.5 Å
(PDB accession code 1A8D). The main chains of the structures differ only
in the orientation of two surface loops, located in crystal packing
interfaces. As both structures were obtained in the same space group,
these packing interface differences, along with the temperature and X-ray
source differences, are likely to account for the differences in
resolution. The structure of BoNT/A at 4°C, and now -170°C, reveals the
orientation of the binding domain and the accessibility of the surface
loops in the presence of the translocation domain (in comparison to the
TeNT-binding domain structure). The binding domain tilts away from the
plane of the catalytic and translocation domains by ~60°. It projects away
from the long axis of the translocation domain by ~40° in the structure
from 4°C data and by ~45° in the structure refined into frozen data. This
observation is not surprising, given the relatively small interface
between the binding and translocation domains.




Figure 2. (a) The conserved residues of the BoNT/A-binding domain are
colored purple, indicating that the N-terminal sub-domain (top) is more
highly conserved than the C-terminal sub-domain (bottom). The second
disulfide in BoNT/A, green, is located behind α22. This helix, with
residues W1265 and Q1269 colored red, is thought to form part of the
ganglioside-binding site. (b) A superposition of the BoNT/A (yellow) and
TeNT (red) binding domains at the ganglioside-binding site and a loop
difference between the two structures. (c) The two subdomains of the
BoNT/A-binding domain are connected by α21 and form an electrostatically
positive cleft in their interface. (d) The cleft in TeNT is relatively
shallow and neutral, largely due to the loop difference shown in (b).



(The binding domain buries ~400 Å² of the translocation domain, while the
translocation domain buries ~480 Å² of the binding domain.) The relative
orientation of this domain could vary even more substantially under
physiological conditions. The two sub-domains of the binding domain are
linked by an α-helix (α21) and create a cleft in their interface
(Figure 2(a) and (c)). The C-terminal subdomain buries ~500 Å² of the
N-terminal subdomain, while the N-terminal subdomain buries ~540 Å² of the
C-terminal subdomain. The contacts between the two subdomains are made
through loops such that this interface could also be flexible, although
the subdomain β-strands align almost identically in the three structures.

Figure 3 (legend shown on page 1096)



The alignment of BoNT/A with TeNT and the six other BoNT serotypes from
the three identified organisms is shown in Figure 3, along with the
secondary structure assignments and solvent accessibility of residues in
the BoNT/A structure.



There is significantly greater sequence homology within the N-terminal
binding subdomain than within the C-terminal subdomain (Figures 2(a)
and 3). The majority of these highly conserved residues in the N-terminal
subdomain, however, point inward, seeming to preserve the hydrophobic core
of the β-barrel. As most of the antigenicity conferring serotype
uniqueness seems to arise in the binding domain (Chen et al., 1997), it is
not surprising that its surface residues vary dramatically among the CNTs.

While the existence of a protein receptor binding site is still under
investigation, the ganglioside-binding site is better understood. A
binding study using trypsin digests of BoNT/A indicates that the 30
C-terminal residues of the C-terminal subdomain are involved in binding
the GT1b ganglioside (Shone et al., 1985). A photoaffinity labeling study
using a novel ganglioside probe implicates this same region in TeNT,
showing labeling of H1295 (Shapiro et al., 1997). The functional
significance of this region has been identified in BoNT/E (Kubota et al.,
1997). A monoclonal antibody capable of neutralizing BoNT/E in mice also
bound the peptide YLTHMRD. This sequence in BoNT/E aligns with residues
1292-1298 of TeNT and residues 1266-1272 of BoNT/A (Figure 3). Most
recently, a binding assay that followed tryptophan fluorescence in BoNT/A
showed fluorescence quenching upon binding the ganglioside (Kamata et al.,
1997). However, with three tryptophan residues within the 30 C-terminal
residues it was impossible to narrow the binding site further. The BoNT/A
structure shows that of the three tryptophan residues, only one (W1265) is
solvent-accessible (Figure 3). This highly hydrophobic residue is fully
exposed to solvent and makes contact with Q1269, the residue in BoNT/A
that aligns to H1295 of TeNT, implicated in the photoaffinity labeling
study. These two residues are located in the i and i + 4 positions of
helix 22, beneath a loop whose orientation varies dramatically between
BoNT/A and TeNT (Figures 1(b) and 2(b)). While in BoNT/A this loop points
out, creating a deep positively charged cleft between sub-domains, the
loop in TeNT folds in, creating a shallow, more neutral cleft. This
surface difference could have a role in the alternative localization of
the two toxins.

Figure 3. The alignment is displayed to show strict sequence conservation
in white letters on a red background, and strong sequence conservation in
red letters. The secondary structure elements of the BoNT/A structure are
labeled α (α-helix), η (3₁₀ helix), β (β-strand) and TT (turn).
Assignments were made using DSSP (Kabsch & Sander, 1983) but slightly
modified to include helix 20 and strands 31, 40, and 41 after visual
inspection. The solvent-accessibility of each residue in the BoNT/A
structure is indicated in the bar at the base of the sequences, with white
representing buried residues, dark blue representing solvent-accessible
residues and light blue representing an intermediate value. This Figure
was generated using ESPript (http://www.ipbs.fr/ESPript).



The translocation domain 

The translocation domain is able to form channels in artificial bilayers
(Blaustein et al., 1987; Donovan & Middlebrook, 1986; Hoch et al., 1985)
and in cell membranes (Sheridan, 1998). Visualization of these channels
using electron cryomicroscopy suggests that the channels may be formed by
the oligomerization of BoNT to form a tetramer (Schmid et al., 1993).
Efforts to identify the pore-forming segment(s) of the CNTs have focussed
on identifying amphipathic sequences capable of spanning the membrane
(Lebeda & Olson, 1995; Montal et al., 1992). Three such sequences were
identified using the MOMENT algorithm for hydrophobic moments (595-614,
625-647, and 648-691) (Lebeda & Olson, 1995). A peptide representing part
of one of these sequences, 659-681, was shown to form channels in planar
lipid bilayers (Oblatt-Montal et al., 1995). This sequence could possibly
oligomerize to form a four-helix bundle in the membrane. The structure of
BoNT/A, solved at pH 7, neither refutes nor supports this hypothesis. The
previously identified amphipathic sequences do not correspond to the long
pairs of kinked α-helices observed in the translocation domain. Instead,
the sequences precede these helices and adopt primarily extended loop
conformations (Figure 4(a)).
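
To illustrate what a hydrophobic moment measures, the sketch below computes a mean hydrophobic moment for a peptide assuming an ideal α-helical periodicity of 100° per residue. It is not the MOMENT program cited above; the Kyte-Doolittle hydropathy scale, the 100° angle, and the function name hydrophobic_moment are assumptions chosen for illustration, and the cited analysis may have used a different scale and window.

```python
import math

# Kyte-Doolittle hydropathy values (an assumed choice of scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Mean hydrophobic moment per residue of a peptide sequence.

    Each residue contributes a vector whose length is its hydropathy and
    whose direction advances by angle_deg per residue (100 degrees for an
    ideal alpha-helix). The magnitude of the vector sum, divided by the
    sequence length, is returned.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    delta = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(scale[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)
```

A uniformly hydrophobic 18-residue segment gives a moment near zero because the contributions cancel around the helix, whereas a segment that alternates hydrophobic and charged residues with roughly helical periodicity gives a large moment. A high moment indicates amphipathicity only under the assumed conformation, which is relevant here because the sequences in question adopt extended loops in the pH 7 structure.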

The conservation among the CNTs in the translocation domain is shown by
sequence in Figure 3 and structurally in Figure 4(c). While virtually no
sequence conservation is observed in the region of the belt, the sequence
conservation is fairly high throughout the rest of the translocation
domain. The majority of the conserved residues are evenly spaced along the
long helices (α14, α15, α16, and α17). This organization seems to preserve
the packing of these helices against each other and against the smaller
helices at either end of the domain (α12 and α18). The only area outside
of these helices of particular note is the region from 622-644. This
sequence is largely loop, with one 3₁₀ helix (η10), and is part of a
previously identified amphipathic sequence, 625-647 (Lebeda & Olson,
1995). The loop extends away from the domain, pointing toward the binding
domain, but is not involved in any inter-domain contacts. To our
knowledge, this sequence has not been investigated for its ability to form
pores in membranes.

Figure 4. (a) and (b) show identical orientations of the BoNT/A
translocation domain and are related to (c) and (d) by a rotation of ~180°
around the long axis of the helices. In (c) and (d), the translocation
domain belt is protruding out of the plane of the page. In (a), H551 is
shown in purple pointing toward the disulfide (yellow) linking the
translocation (green) and catalytic (shown partially in blue) domains. The
three amphipathic sequences possibly involved in pore formation are
colored orange and red, with conserved residues colored red. These
sequences form part of a conserved electrostatic cluster shown with an
arrow in (b). (c) The overall sequence conservation in the translocation
domain structure. The conserved residues, shown in purple, act largely to
hold the helical arrangement together, the exception being the conserved
residues in the extended loops of the amphipathic sequences, shown in (a).
The arrow in (d) points to a cluster of charges conserved at the interface
between the main body of the translocation domain and the beginning of the
translocation domain belt. (a) and (c) were made using MOLSCRIPT (Kraulis,
1991), while the electrostatic potential surfaces shown in (b) and (d)
were made using GRASP (Nicholls et al., 1991).

The critical follow-up question after identifying the pore-forming segment
is the molecular mechanism by which pH triggers this sequence to change
structure and form a membrane-spanning channel. The translocation domain
is geared to sense this environmental variable, as the calculated pI for
its sequence is 4.66 (as compared to 9.1 for the binding domain and 6.0
for the whole toxin). This low value is attributable to the first half of
the domain containing the amphipathic loops (548-685, pI 4.6) as opposed
to the second half containing the kinked pair of α-helices (686-872,
pI 8.5). The three-dimensional charge distribution indicates several
clusters of negative charges on both sides of the surface (Figure 4(b) and
(d)). One cluster (Figure 4(b)) is composed of D612, D615, E616, E619, and
E588. Of these residues, D612 is strictly conserved, E616 and E588 are
largely conserved, and E619 is consistently polar. The clustering of
negative charge is expected to raise the pKa of these residues, making
them titratable at endosomal pH (4.5-5.5). We therefore propose that these
residues and possibly any histidine residues will effect the structural
changes required for pore formation.
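
The titration argument above can be made concrete with the Henderson-Hasselbalch relation. The sketch below gives the fraction of a single acidic group that is protonated at a given pH; the function name and any pKa values a reader supplies (for example, about 4 for an unperturbed carboxylate versus a higher value for a residue in a negatively charged cluster) are illustrative assumptions, not values measured or calculated in this study.

```python
def fraction_protonated(ph, pka):
    """Fraction of an acidic group in its protonated form at a given pH.

    Follows from the Henderson-Hasselbalch relation for one independent
    titratable site: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))
```

Under this simple single-site model, a carboxylate with a pKa near 4 is almost entirely deprotonated at pH 7 and remains mostly deprotonated at pH 5, whereas one whose pKa is raised into the 4.5-5.5 range changes protonation state substantially across the endosomal pH interval. Interacting sites in a real cluster do not titrate independently, so this is only a first approximation.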

The translocation domain contains only two histidine residues (H551 and
H560) located in and after α10, respectively, such that they immediately
follow the translocation domain belt (492-545) and precede a disordered
loop in the structure. While H560 is solvent-accessible, H551 is buried at
the interface of the catalytic and translocation domains within 7 Å of
their connecting disulfide. The idea that H551 protonation could disrupt
this interface is supported by the fact that the loop following H551 is
disordered in the crystal structure. The original data for BoNT/A showed
high B-factors for residues 560-585, with 561-568 completely disordered.
The newer data, collected from a single frozen crystal, revealed density
in this area, although the B-factors for residues placed in this density
remain high (Figure 4(a)). This region likely represents an area of
inherent flexibility and is proximal both to the negatively charged
cluster and the buried histidine residue. This feature could facilitate
the structural changes that accompany pore formation as well as the
translocation of the unfolded catalytic domain through the pore. The role
that the belt might play in translocation is unknown, although with
movement in this loop region it is not implausible to consider that the
belt could move as well. A second electrostatically charged cluster on the
other side of the translocation domain (Figure 4(d)) might also contribute
to belt movement. This cluster is formed by both positive and negative
residues of α14 (K700, E703, K704, D706, E707, and K710), along with
negative charges at the beginning of the belt (D473, E478, E487, and
E490). Protonation at this interface could interrupt interactions holding
the belt to the main body of the translocation domain, thus allowing for
the translocation and release of the catalytic domain. While the domains
are tethered by a disulfide and the interface between the catalytic and
translocation domains (not including the belt) includes ~1190 Å² of buried
surface area, the non-covalent contacts between the belt and catalytic
domain are not substantial. The 54 residue belt buries an additional
~1620 Å² of the catalytic domain surface area. This translates to ~30 Å²
per residue of the belt, considerably lower than the ~200 Å² buried by an
isoleucine residue in a well-packed protein core. The energy required for
removing the belt is probably even lower than that indicated by the buried
surface area, since the belt is presumably unstructured in the absence of
the catalytic domain, thus adding an entropic cost to their association.
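
The per-residue figure quoted above follows directly from the belt boundaries and the buried area given in the text. The short sketch below reproduces the arithmetic; the constant and function names are illustrative, and the numbers are the approximate values stated in this section.

```python
BELT_FIRST, BELT_LAST = 492, 545      # belt residue range given in the text
BELT_BURIED_AREA = 1620.0             # Å² of catalytic domain buried by the belt (approximate)
CORE_ILE_BURIED_AREA = 200.0          # Å² buried by a core isoleucine (approximate)


def buried_area_per_residue(total_area, first, last):
    """Average buried area per residue over an inclusive residue range."""
    n_residues = last - first + 1
    return total_area / n_residues


per_residue = buried_area_per_residue(BELT_BURIED_AREA, BELT_FIRST, BELT_LAST)
ratio_to_core = per_residue / CORE_ILE_BURIED_AREA
```

With these inputs the range 492-545 contains 54 residues, the average is 30 Å² per residue, and the ratio to a well-packed isoleucine is 0.15, consistent with the description of the belt contacts as not substantial.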



The catalytic domain 

Previous alignments have indicated that the catalytic domains share up to
36.5% sequence identity, the exception being the catalytic domains of
BoNT/B and TeNT, which share 51.6% sequence identity (Kurazono et al.,
1992). All sequences contain a HEXXH sequence motif, typical of many zinc
proteases. The catalytic domain sequence alignment also shows the strict
conservation of two Asp-Pro bonds, one immediately preceding the HEXXH
sequence (D215 and P216) and the other at the N terminus of the molecule
(D11 and P12). A third Asp-Pro bond is present in all of the sequences
except BoNT/A, where S74 exists instead of proline. Asp-Pro bonds from
BoNT/A have been shown to hydrolyze at pH 5 (DasGupta & Evenson, 1992;
DasGupta & Tepp, 1991), a fact that may bear relevance given the low pH
environment of the endosome and the need to translocate a 50 kDa protein.
It should be noted, however, that a truncation after residue 9 inactivates
in vitro catalysis (Kurazono et al., 1992).
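
Pairwise identity values such as those quoted at the start of this section depend on how gapped positions are counted. The sketch below shows one common convention, in which identity is computed over columns where neither aligned sequence has a gap; the function name and this choice of denominator are assumptions for illustration and may differ from the convention used in the cited alignments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise or multiple alignment.

    Only columns in which both sequences have a residue are compared;
    columns containing a gap in either row are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [(a, b) for a, b in zip(aligned_a, aligned_b)
                if a != gap and b != gap]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return 100.0 * matches / len(compared)
```

For example, the toy rows "HEAAH" and "HEGGH" match at three of five compared columns, giving 60% identity.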

Other than the zinc protease HEXXH sequence motif, the catalytic domains
share no sequence similarity with proteins outside of the CNT family.
While a DALI search showed that thermolysin, a well-characterized zinc
protease, bore the highest structural identity with the BoNT/A catalytic
domain, this similarity was weak with a Z-score of 4.6 for 139 residues
(Lacy et al., 1998). Nevertheless, a visual comparison of the two enzymes
(Figure 5) is useful, because, like BoNT/A, thermolysin coordinates the
active-site zinc ion with two histidine residues (His142 and His146), a
glutamate residue (Glu166), and a water-mediated glutamate residue
(Glu143) (Colman et al., 1972). Structural similarities include the helix
containing the HEXXH sequence (α3) and a four-stranded β-sheet (β3, β6,
β7, and β8). This accounts for the structural presentation of the two
histidine ligands and the glutamate residue that coordinates the activated
water in catalysis. The presentation of the fourth ligand is intriguing,
as thermolysin presents the Glu166 ligand with a single helix, while
BoNT/A has generated a similar presentation by folding with two smaller
helices (α4 and α8) end-to-end. As the two BoNT/A helices point in
opposite directions, it is uncertain if maintaining this helical
periodicity plays a role in the Glu261 presentation. More interestingly,
it supports this activity as a likely example of convergent evolution.

Figure 5. A comparison of (a) thermolysin and (b) the BoNT/A catalytic
domain. The helix containing the HEXXH zinc protease motif is colored red,
while the helix presenting the second Glu is colored gold. The topology of
the secondary structural elements differs such that in BoNT/A, the shorter
green and gold helices come together to form the structural equivalent of
the single gold helix in thermolysin. Loops B, C, and D of BoNT/A are
colored yellow, pink, and orchid, respectively, and form boundaries of
accessibility to the active-site zinc

The fold comparison highlights the likely cleft by which the substrate
gains accessibility to the active-site. This channel of accessibility is
bordered by three loops in BoNT/A designated B (47-80), C (231-259), and
D (356-371) (Figure 5(b)). The loops in BoNT/A are longer than those in
thermolysin and result in a more buried active-site, although these loops
could be flexible or oriented differently in the absence of the
translocation domain. A second avenue of active-site access would be for
substrate to enter from a channel seen axially both in Figure 5(b) and
Figure 6(a). In the holotoxin, the translocation domain belt and the main
body of the translocation domain (Figure 6(b)) occlude both channels of
accessibility.

The conserved residues of the catalytic domain were mapped onto the
three-dimensional structure of BoNT/A and are shown in Figure 7. Conserved
residues within 6 Å of the active-site cavity are depicted with their full
side-chains. In addition to the anticipated residues of the HEXXH motif
(H222, E223, and H226), both E260 and E261 (where E261 can coordinate the
zinc ion) and R362 and Y365 of loop D are located within this proximity.
These seven residues are located in a plane at the base of the active-site
channel, thus making it reasonable to assume that their conservation
preserves the general zinc endopeptidase activity. The specificity is
therefore likely to arise from the residues forming the channel, or
cavity, above this base. This cavity is at least 1075 Å³ in the presence
of the translocation domain. Non-conserved catalytic domain residues
within 4 Å of this cavity include 65-68 of loop B, 161-163, 193, 219, 238
of loop C, 255-258, and 368-369 of loop D. These residues were analyzed in
smaller subsets, particularly BoNT/B and TeNT, for sequence conservation,
but none was readily apparent. Residues 459, 530, 531, 533, and 536 of the
translocation domain also border the active-site cavity, where P531 is
conserved in all ten sequences and is the only strictly conserved residue
of the belt. These residues, which occlude access to the active-site in
the holotoxin form, are presumably absent following translocation through
the membrane. The catalytic domain may adopt an altered conformation
following its translocation, a possibility that should be addressed by the
structure determination of this domain in the absence of the neurotoxin.

Figure 6. A CPK model of (a) the BoNT/A catalytic domain and (b) the
BoNT/A holotoxin showing how accessibility to the buried active-site zinc
ion (orange) is occluded both by the translocation domain belt and the
long axis of the translocation domain (green).

While some level of sequence conservation was expected in the vicinity of
the active-site, the appearance of conservation in two surface loops
distant from the active-site was surprising. Loop A (1-18) contains the
conserved residues F7, N8, Y9, D11, and P12, and is close to conserved
residues of β1, β2, β3, β6, as well as the N terminus of α1 (Figure 8).
The importance of this loop was identified in experiments measuring the
toxicity of N-terminal and C-terminal catalytic domain deletions (Kurazono
et al., 1992). While a deletion of residues 1-7 was tolerated, the
deletion of residues 1-9 was not, implicating a direct biochemical and/or
indirect structural role for one or both of residues N8 and Y9. While N8
is largely solvent-accessible, Y9 forms part of a conserved charged pocket
containing K33 (β2), E46 (β3), D80 (α1), and K83 (α1) (Figure 8). Since
D80 and K83 are contained within one end of α1 and the other end of α1 has
contact with α3, the HEXXH helix, it is possible that this conservation is
merely to preserve the orientation of these helices, and therefore, the
structural integrity of the catalytic machinery. This possibility seems
unlikely, however, given the 30 Å distance of Y9 from the zinc ion and the
lack of sequence conservation in secondary structural elements more
proximal to the HEXXH helix (for example, β7 and β8). It seems more likely
that the conserved presence of Tyr in the charged pocket is preserving a
remote tertiary structure critical for catalysis. This is supported by
experiments showing that a monoclonal antibody (mAb) mapping to residues
27-52, when administered intraneurally, neutralized the effects of BoNT/A
(Cenci Di Bello et al., 1994). The solvent-accessible residues of this
sequence and likely mAb binding site immediately precede and follow the
strands β2 and β3, respectively.

Figure 8. The BoNT/A catalytic domain is shown in purple with residues
1-88 colored yellow. Residues 450-550 of the translocation domain are
colored green to show the relative position of the translocation domain
belt. The Asp-Pro residues are colored red, while the conserved cluster of
residues around Tyr9 is shown in purple. The backbone corresponding to the
three proline residues P60-P62 is colored green and shows the relative
proximity to the active-site zinc ion (gray sphere). Labels are underneath
their corresponding secondary structural element.

The second conserved surface loop, the last part of loop B, is also
located within this proximity. It is intriguing that both loops A and B
contain an Asp-Pro pair of residues, although the conservation in loop B
did not extend to BoNT/A (Figure 8). While acidic pH is not required for
catalysis, cleavage at these sites may aid the efficiency of
translocation. The conservation in the latter part of loop B is also of
interest, as the preceding residues come into contact with the active-site
cavity. An unusual sequence of three consecutive proline residues (P60,
P61, and P62) is followed by residues 65-68 (previously noted for their
proximity to the channel).

There is mounting evidence suggesting the existence of a remote
recognition sequence necessary for catalysis. A ten residue sequence
shared by all three of the SNARE proteins has been identified and termed
the SNARE motif (Rossetto et al., 1994). This motif is thought to be
required for efficient catalysis and is present at varying distances
upstream of the cleavage site. A potentially related study shows that the
addition of peptides in trans upstream and downstream of the TeNT cleavage
site is able to activate the toxin (Cornille et al., 1997). The idea of
exosite-dependent cleavage implies that a conformational change, in
addition to simple removal of the translocation domain, may be required
for catalysis. Perhaps the regions in BoNT/A that share high degrees of
sequence conservation distal from the active-site represent such general
activation domains. Alternatively, these conserved surface loops could
represent oligomeric interfaces or be involved in localization of the
toxin within the cytosol.



Materials and Methods 

Sequence alignments and superposition on the 
BoNT/A structure 

Nine CNT sequences were aligned to that of BoNT/A using CLUSTAL W
(Thompson et al., 1994). A gap opening penalty of 10, gap extension
penalty of 0.05, and gap separation distance of 8 were used with the
BLOSUM62 matrix (Henikoff & Henikoff, 1996). This matrix was also used in
the display in Figure 3, using a global score of 0.15. The Risler matrix
was used to generate per residue scores for the alignment (Risler et al.,
1988). These scores were output in the B-factor column of the BoNT/A
coordinates, facilitating the display of sequence conservation on the
three-dimensional structure. Residues with scores 68 and higher were
colored purple, while those below 68 were colored yellow. The number 68
was empirically determined to best match the display of sequence
conservation seen in Figure 3.
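
The bookkeeping described above can be sketched as follows. This is an illustrative reconstruction, not the script used in this study: the function names, the dictionary of per-residue scores, and the decision to match residues by number alone (ignoring chain identifiers) are assumptions. The fixed column positions follow the standard PDB ATOM record layout, with the residue number in columns 23-26 and the B-factor in columns 61-66.

```python
THRESHOLD = 68.0  # cutoff stated in the text


def color_for_score(score, threshold=THRESHOLD):
    """Purple for scores at or above the cutoff, yellow below it."""
    return "purple" if score >= threshold else "yellow"


def write_scores_to_bfactor(pdb_lines, scores):
    """Return PDB lines with per-residue scores in the B-factor column.

    pdb_lines: iterable of PDB-format text lines.
    scores: dict mapping residue number (int) to a conservation score.
    Residues without a score, and non-coordinate records, are left as-is.
    """
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM")) and len(line) >= 66:
            resnum = int(line[22:26])            # columns 23-26
            if resnum in scores:
                # Columns 61-66 hold the B-factor as a 6-character field.
                line = line[:60] + f"{scores[resnum]:6.2f}" + line[66:]
        out.append(line)
    return out
```

Once the scores occupy the B-factor column, any molecular graphics program that can color by B-factor can display conservation on the structure, and the cutoff can be adjusted without repeating the alignment.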

Toxin structure at -170°C 

The protein was concentrated to 5 mg/ml in 10 mM Hepes (pH 7), 100 mM
KCl. This stock was mixed 1:1 (v/v) with a mother liquor of 30% (v/v)
PEG 600 and 100 mM Tris (pH 7) and allowed to crystallize by hanging drop
vapor diffusion. The crystal formed overnight and grew to a size of about
100 μm × 200 μm × 520 μm over the course of a week. The crystal was washed
briefly in a solution made from 2 μl of mother liquor and 2 μl of 30%
PEG 600 and plunged into liquid nitrogen. Data were collected at SSRL
beamline 9-1 and processed with DENZO, yielding data 99.4% complete to
3.3 Å with 98.5% completeness in the last bin. The data were 5.8-fold
redundant with an R_merge of 8.5% overall. After orienting the molecule
with AMoRe, the structure was readily refined into the frozen data using
CNS, yielding a final R_work of 26% and R_free of 31%. While the
processing and refinement statistics indicate that the quality is lower
than the structure solved at 4°C, these data did bring in density for
residues 561-568.